Robust, Generalized, Quick and Efficient Agglomerative Clustering

Authors

  • Manolis Wallace
  • Stefanos D. Kollias
Abstract

Hierarchical approaches, which are dominated by the generic agglomerative clustering algorithm, are suitable for cases in which the number of distinct clusters in the data is not known a priori; this is not a rare case in real data. On the other hand, their application raises important problems, such as high complexity and susceptibility to errors in the initial steps, which propagate all the way to the final output. Finally, as with all other clustering techniques, their efficiency decreases as the dimensionality of their input increases. In this paper we propose a robust, generalized, quick and efficient extension to the generic agglomerative clustering process. Robust refers to the proposed approach's ability to overcome the classic algorithm's susceptibility to errors in the initial steps; generalized, to its ability to consider multiple distance metrics simultaneously; quick, to its suitability for larger datasets, achieved by applying the computationally expensive components to only a subset of the available data samples; and efficient, to its ability to produce results that are comparable to those of trained classifiers, largely outperforming the generic agglomerative process.
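
For context, the generic agglomerative process that the abstract takes as its baseline can be sketched as follows. This is a minimal illustration and not the authors' proposed extension: the function name `agglomerative`, the average-linkage rule, the Euclidean metric and the `threshold` stopping parameter are all assumptions made here for the example.

```python
from itertools import combinations
from math import dist


def agglomerative(points, threshold):
    """Generic agglomerative clustering (illustrative sketch).

    Assumptions: Euclidean distance, average linkage, and a distance
    threshold as the stopping rule, so the number of clusters does not
    have to be known in advance.
    Returns a list of clusters, each a list of indices into `points`.
    """
    # Start with every sample in its own singleton cluster.
    clusters = [[i] for i in range(len(points))]

    def linkage(a, b):
        # Average pairwise distance between the members of two clusters.
        total = sum(dist(points[i], points[j]) for i in a for j in b)
        return total / (len(a) * len(b))

    while len(clusters) > 1:
        # Find the closest pair of clusters under the linkage rule.
        ia, ib = min(
            combinations(range(len(clusters)), 2),
            key=lambda pair: linkage(clusters[pair[0]], clusters[pair[1]]),
        )
        # Stop once even the closest pair is too far apart to merge.
        if linkage(clusters[ia], clusters[ib]) > threshold:
            break
        merged = clusters[ia] + clusters[ib]
        clusters = [c for k, c in enumerate(clusters) if k not in (ia, ib)]
        clusters.append(merged)
    return clusters


if __name__ == "__main__":
    samples = [(0.0, 0.0), (0.0, 1.0), (10.0, 10.0), (10.0, 11.0)]
    print(agglomerative(samples, threshold=3.0))
```

Each merge in this sketch is final, which illustrates why an early wrong merge propagates to the output, and every iteration recomputes all pairwise linkages, which illustrates the high complexity the abstract mentions.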

Similar resources

Enhancement Clustering of Cloud Datasets using Improved Agglomerative Technique

Prof. Madhuri h Parekh, Smt. J.J.Kundaliya Commerce College, Rajkot, Gujarat, India. Abstract: Cloud computing is the latest technology that delivers computing...

Robust Hierarchical Clustering

One of the most widely used techniques for data clustering is agglomerative clustering. Such algorithms have long been used across many different fields, ranging from computational biology to social sciences to computer vision, in part because their output is easy to interpret. Unfortunately, it is well known that many of the classic agglomerative clustering algorithms are not robust to...

Agglomerative Info-Clustering

An agglomerative clustering of random variables is proposed, where clusters of random variables sharing the maximum amount of multivariate mutual information are merged successively to form larger clusters. Compared to the previous info-clustering algorithms, the agglomerative approach allows the computation to stop earlier when clusters of desired size and accuracy are obtained. An efficient a...

Efficient Clustering and Matching for Object Class Recognition

In this paper we address the problem of building object class representations based on local features and fast matching in a large database. We propose an efficient algorithm for hierarchical agglomerative clustering. We examine different agglomerative and partitional clustering strategies and compare the quality of the obtained clusters. Our combination of partitional-agglomerative clustering give...

Divisive Hierarchical Clustering with K-means and Agglomerative Hierarchical Clustering

The aim is to implement a divisive hierarchical clustering algorithm with K-means and to apply agglomerative hierarchical clustering to the resulting data, for data mining tasks where efficient and accurate results are needed. In the hierarchical clustering, the initial k centroids are found in a fixed manner instead of being chosen randomly. The k centroids are chosen by dividing the one dimensional data of a particular clu...

Publication date: 2004